Active Learning to Classify Email

Authors

  • Bryan Klimt
  • Shyamsundar Jayaraman
  • Yiming Yang

Abstract

Although active learning has been applied successfully to improve text classification, its use in email classification has not yet been explored. This paper examines several state-of-the-art algorithms for active learning with support vector machines as applied to email folder classification. We also introduce several extensions to these methods, designed specifically to improve the quality of active learning when used for email folders. We evaluated the relative accuracy of these algorithms on a large, publicly available email corpus. Our results show that current active learning methods for text classification work poorly for email foldering, but that taking chronological information, such as receipt time, into account improves on them significantly.

Related resources

An Adaptive Congestion Alleviating Protocol for Healthcare Applications in Wireless Body Sensor Networks: Learning Automata Approach

Wireless Body Sensor Networks (WBSNs) involve a convergence of biosensors, wireless communication, and network technologies. WBSNs enable real-time healthcare services for users. Wireless sensors can be used to monitor patients’ physical conditions and transfer real-time vital signs to the emergency center or individual doctors. Wireless networks are subject to more packet loss and congestion. T...

Learning to Classify Email into "Speech Acts"

It is often useful to classify email according to the intent of the sender (e.g., "propose a meeting", "deliver information"). We present experimental results in learning to classify email in this fashion, where each class corresponds to a verb-noun pair taken from a predefined ontology describing typical “email speech acts”. We demonstrate that, although this categorization problem is quite dif...

Active Learning with Boosting for Spam Detection

Spam detection algorithms have been developed to train on a sufficiently large set of labeled data and predict, with a high accuracy of 95%, whether an email is spam or not. A problem that arises in this setting is that labeling examples is a costly process: it requires humans to read and classify them one by one. Active learning is a learning approach developed to address this problem. It learns a sma...

Email Classification and Summarization: A Machine Learning Approach

This paper presents the design and implementation of a system to group and summarize email messages. The system uses the subject and content of email messages to classify emails based on users’ activities and to generate summaries of each incoming message with an unsupervised learning approach. Our framework solves the problem of email overload, congestion, difficulties in prioritizing and difficulti...

Publication date: 2005